ABSTRACT

At Golden West College (California), student writing
samples are holistically scored by pairs of judges on a six-point
scale. Judges are allowed to use plus and minus figures, thus
converting the integer scale to a decimal scale of evaluation. In
1991, 499 writing samples written as part of the placement testing
process for students in the Coast Community College District's SOAR
program were analyzed to assess the reliability of the judges' scoring.
Two procedures were used to assess the extent to which judgments of
writing samples were consistent. The first method entailed
calculating the difference between the two ratings of all writing
samples. For 34.4% of the pairs, the ratings from independent judges
were identical. Ratings differed by one-third of a point in 30.9% of
the cases, by two-thirds of a point in 15.4% of the cases, and by
exactly one point in 13.4% of the cases. Overall, ratings differed by
one point or less in 94.1% of the cases, exceeding the minimal
standard of 90% recommended by the California Community Colleges
(CCC). The second procedure was a Pearson correlation
coefficient, which assessed the relationship between paired ratings
for the writing samples. The correlation between ratings was
moderately strong and positive (r = .76), exceeding the minimal
standard of .75 recommended by the CCC. The data supported the
hypothesis that the ratings from independent judges were made in a
consistent manner. Two references and an appendix of related data are
attached. (JNC)
Judgments of Placement Writing Samples at Golden West College:
An Evaluation of Inter-Rater Reliability 

Steven Isonio, PhD 

Background. A number of specific assessment validation
requirements have been put forth regarding multiple-choice, 
objective assessment instruments (California Community Colleges, 
1990). Additionally, colleges that use writing samples in their 
placement rules have an obligation to demonstrate that these 
essay tests meet certain minimal standards for reliability. 

At Golden West College, as is typically the case, writing 
samples are holistically scored by pairs of judges. A 6-point 
scale is used with higher scores indicating a more thoughtful 
response to the theme topic and general mastery of most of the 
grammar and usage conventions of standard English. This scale is 
applied independently to the writing sample by both judges. In 
this context, reliability concerns the extent to which 
evaluations provided by different judges of the same writing 
sample are in agreement. 

Method. A total of 499 writing samples written as a part of
the placement testing process for students participating in the
Coast Community College District SOAR program were used for the
analysis. In accordance with established Golden West College
procedures, a "norming" period in which judges of writing samples
discuss standards and expectations and evaluate a small number of
samples preceded the reading and evaluation of the entire set
of writing samples. Each writing sample was read by two judges
who independently provided a rating. In cases where the ratings
differed by more than one point (on the 6-point scale), an
evaluation by a third reader was made. The rating of the writing
sample is then typically used in conjunction with the objective
test score to make a placement recommendation for the student.
Figure 1 depicts the frequency distribution for all ratings
applied to the writing samples in the present analysis (all
Figures and Tables appear in the Appendix). As can be seen, the
modal rating is "3", followed by "2"; the next most frequent
ratings are "3-", "3+", and "4-".

The pairs of ratings of the samples were compiled for 
analysis. Since it is the practice of the judges to use "+" and "-"
as a part of some ratings (e.g., a "3+" and "4-" are
sometimes used to make fine distinctions between "3" and "4"), a 
conversion to decimals was necessary. Table 1 shows the values 
used for translating original ratings into converted ratings 
(with decimals) for purposes of the analysis. 
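
The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's actual procedure: it assumes that a "+" or "-" shifts a rating by one-third of a point, which is consistent with the "1-" = 0.67 entry in Table 1 and with the differences reported in thirds of a point. The function name convert_rating is hypothetical.

```python
from fractions import Fraction


def convert_rating(label: str) -> float:
    """Convert a rating such as '3', '3+' or '4-' to a decimal value.

    Assumption (not confirmed by the full Table 1): '+' adds one-third
    of a point and '-' subtracts one-third of a point.
    """
    label = label.strip()
    base = int(label.rstrip("+-"))  # integer part of the 6-point scale
    if not 1 <= base <= 6:
        raise ValueError(f"rating out of range: {label!r}")
    if label.endswith("+"):
        offset = Fraction(1, 3)
    elif label.endswith("-"):
        offset = Fraction(-1, 3)
    else:
        offset = Fraction(0)
    # Round to two decimals, matching the style of the 0.67 entry in Table 1.
    return round(float(base + offset), 2)


print(convert_rating("1-"))  # 0.67
print(convert_rating("3+"))  # 3.33
print(convert_rating("4-"))  # 3.67
```

Under this assumption, a "3+" and a "4-" differ by one-third of a point, which is the fine distinction between "3" and "4" that the judges intend.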

Results. Two procedures were used to assess the extent to 
which judgments of writing samples are consistent. The first 
method entailed calculating the difference between the two 
converted ratings of all writing samples. A total of 499 pairs 
were analyzed. As Figure 2 indicates, for 34.4% of the pairs, 
the ratings from independent judges were identical. Ratings
differed by one-third of a point in 30.9% of the cases, by
two-thirds of a point in 15.4% of the cases, and by exactly one point
in 13.4% of the cases. Thus, in 94.1% of the cases, ratings
differed by one point or less. This exceeds the recommended
minimal standard of 90% (California Community Colleges, 1990).
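
As a rough sketch of this first procedure, the Python below tallies the absolute difference within each pair of ratings and reports the percentage of pairs within one point. It rests on the assumption that "+" and "-" each shift a rating by one-third of a point, so ratings are held as whole numbers of thirds to avoid rounding error. The function names and the example pairs are hypothetical and are not the study's data.

```python
from collections import Counter


def to_thirds(label: str) -> int:
    """Express a rating such as '3+' as a whole number of thirds of a point."""
    label = label.strip()
    base = int(label.rstrip("+-"))
    shift = 1 if label.endswith("+") else -1 if label.endswith("-") else 0
    return 3 * base + shift


def agreement_summary(pairs):
    """Return ({gap in thirds: percent of pairs}, percent within one point)."""
    gaps = Counter(abs(to_thirds(a) - to_thirds(b)) for a, b in pairs)
    total = sum(gaps.values())
    percent_by_gap = {gap: 100 * n / total for gap, n in sorted(gaps.items())}
    # One point equals three thirds.
    within_one_point = sum(p for gap, p in percent_by_gap.items() if gap <= 3)
    return percent_by_gap, within_one_point


def needs_third_reader(a: str, b: str) -> bool:
    """True when two ratings differ by more than one point (see Method)."""
    return abs(to_thirds(a) - to_thirds(b)) > 3


# Hypothetical ratings for illustration only.
example_pairs = [("3", "3"), ("3", "3+"), ("2", "3-"), ("3", "4"), ("2", "4-")]
by_gap, within = agreement_summary(example_pairs)
print(by_gap)   # {0: 20.0, 1: 20.0, 2: 20.0, 3: 20.0, 5: 20.0}
print(within)   # 80.0
```

Applied to the reported figures, the same sum is 34.4 + 30.9 + 15.4 + 13.4 = 94.1 percent of pairs within one point.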

A Pearson correlation coefficient was calculated to assess 
the relationship between paired ratings for the writing samples, 
as described in Assessment Validation Project Local Research 
Options, Design 18 (Matriculation Assessment Work Group, 1991). 
The correlation between ratings for the 499 writing samples was 
moderately strong and positive [r = .76, p < .001]. This value
exceeds the recommended minimal standard of .75 (California
Community Colleges, 1990).
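
For readers who wish to reproduce this kind of check on their own paired ratings, the following Python is a minimal sketch of the standard Pearson product-moment formula. It is not the code used in the study. The function name pearson_r and the example ratings are hypothetical, and the significance test (p value) is omitted.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of converted ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical converted ratings from a first and a second judge.
first_judge = [2.0, 3.0, 3.33, 4.0, 2.67]
second_judge = [2.33, 3.0, 3.0, 3.67, 3.0]
r = pearson_r(first_judge, second_judge)
print(f"r = {r:.2f}; meets the .75 standard: {r >= 0.75}")
```

In Python 3.10 and later, statistics.correlation from the standard library computes the same coefficient.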

Discussion. Evidence based upon writing samples produced
during the placement testing portion of the SOAR program 
indicates that the ratings from independent judges are indeed 
made in a consistent manner. Both in terms of the proportion of 
pairs of ratings within one point of each other and the 
correlation between ratings, the degree of consistency exceeds 
the minimal standards specified by the California Community 
Colleges Chancellor's Office. 
Table 1

Conversion From 6-point Scale to Decimal Scale

Original rating    Converted rating
1-                 0.67
(remaining rows not legible in the source)

Figure 1. Frequency distribution of all ratings applied to the writing samples (figure not reproduced).

Figure 2. Differences between paired ratings (figure not reproduced).